A Multiple-Stage Framework for Related Entity Finding: FDWIM at TREC 2010 Entity Track
Abstract
This paper describes a multiple-stage retrieval framework for the related entity finding task of the TREC 2010 Entity Track. In the document retrieval stage, a search engine is used to improve retrieval accuracy. In the entity extraction and filtering stage, we extract entities with NER tools, Wikipedia, and text pattern recognition, then apply a stoplist and other rules to filter them. Deep mining of authority pages proves effective in this stage. In the entity ranking stage, several factors are considered, including keywords from the narrative, page rank, and the combined results of corpus-based association rules and the search engine. In the final stage, an improved feature-based algorithm is proposed for entity homepage detection.
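The filtering and ranking stages described in the abstract can be illustrated with a minimal sketch. It assumes a linear weighted combination of the ranking factors, which the abstract does not confirm; the `Candidate` fields, the `STOPLIST` entries, the `WEIGHTS` values, and the function names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    keyword_score: float      # overlap with narrative keywords (assumed in [0, 1])
    page_rank: float          # authority of the supporting page (assumed in [0, 1])
    association_score: float  # corpus-based association with the source entity


# Hypothetical stoplist and weights, chosen only for illustration.
STOPLIST = {"home", "contact us", "click here"}
WEIGHTS = {"keyword_score": 0.4, "page_rank": 0.2, "association_score": 0.4}


def filter_candidates(candidates):
    """Drop candidates whose normalized name is too short or in the stoplist."""
    kept = []
    for c in candidates:
        name = c.name.strip().lower()
        if len(name) < 2 or name in STOPLIST:
            continue
        kept.append(c)
    return kept


def score(c):
    """Weighted sum of the ranking factors (an assumed combination rule)."""
    return sum(w * getattr(c, field) for field, w in WEIGHTS.items())


def rank_candidates(candidates):
    """Filter first, then sort the survivors by descending score."""
    return sorted(filter_candidates(candidates), key=score, reverse=True)
```

Calling `rank_candidates` on a list of `Candidate` objects returns the filtered candidates, best first. A learned ranker or a different combination rule could replace `score` without changing the rest of the sketch.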
Similar resources
Purdue at TREC 2010 Entity Track: A Probabilistic Framework for Matching Types Between Candidate and Target Entities
This paper gives an overview of our work for the TREC 2010 Entity track. The goal of the TREC Entity track is to study entity-related searches on Web data, which has not been sufficiently addressed in prior research. For both the Related Entity Finding (REF) task and the Entity List Completion (ELC) task in this track, we propose a unified probabilistic framework by incorporating the matching b...
LADS: Rapid Development of a Learning-To-Rank Based Related Entity Finding System using Open Advancement
In this paper, we present our system called LADS, tailored to work on the TREC Entity Track Task of Related Entity Finding. The LADS system consists of four key components: document retrieval, entity extraction, feature extraction and entity ranking. We adopt the open advancement framework for the rapid development and use a learning-to-rank approach to rank candidate entities. We also experime...
A Novel Framework for Related Entities Finding: ICTNET at TREC 2009 Entity Track
This paper addresses the problem of related entity finding, which was proposed in TREC 2009. The overall aim of related entity finding (REF) is to perform entity-related search on Web data, which addresses common information needs that are not well modeled as ad hoc document search. In this paper, a novel framework was proposed based on a probabilistic model for related entity finding in a W...
NiCT at TREC 2010: Related Entity Finding
This paper describes experiments carried out at NiCT for the TREC 2010 Entity track. Our studies mainly focus on improving the NE Extraction and Ranking Entity modules, both of which play vital roles in a Related Entity Finding system. In last year's system, only a Named Entity Recognition tool was used to extract entities that match coarse-grained types of target entities such as organization,...